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The deterministic dynamics of randomly connected neural networks are studied, where a state of 
binary neurons evolves according to a discrete-time synchronous update rule. We give a theoretical 
support that the overlap of systems’ states between the current and a previous time develops in time 
according to a Markovian stochastic process in large networks. This Markovian process predicts 
how often a network revisits one of previously visited states, depending on the system size. The 
state concentration probability, i.e., the probability that two distinct states co-evolve to the same 
state, is utilized to analytically derive various characteristics that quantify attractors’ structure. The 
analytical predictions about the total number of attractors, the typical cycle length, and the number 
of states belonging to all attractive cycles match well with numerical simulations for relatively large 
system sizes. 

PACS numbers: 02.50.Ey, 84.35.+i, 05.45.-a 


I. INTRODUCTION 

Neurons in the brain interact with each other in a heterogeneous and asymmetric way jl|, producing complex 
neuronal dynamics for information processing. In the past decades, there are a surge of research interests in randomly 
connected neural networks 0 - 0 - Although their behavior is described by simple deterministic equations, the resulting 
dynamics are rich, exhibiting fixed-point behavior, limit cycles, or high-dimensional chaos. These networks are capable 
of generating useful dynamic activitvpatterns after appropriate learning Si- 

Simple models of neural networks [10l - [l8i | have been explored to elucidate characteristics of their complex dynamics. 
In these networks, connections between binary neurons are independently drawn from an identical distribution, and 
the state of a network is updated simultaneously in discrete time steps without thermal noise. Thus, every initial 
configuration must evolve into an attractor, which is either a fixed point or a limit cycle. Because a fixed point is 
a limit cycle of length 1 = 1, the whole state space is divided into separated basins of attractions with heterogenous 
cycle len gths . Extensive numerical simulations were carried out to analyze the typical cycle length and the number 
of cycles [13, [lj| . The typical length of the cycles was observed to grow exponentially with the number of neurons 
n (such kinds of cycles are called chaotic attractors), and the total number of attractors increases linearly with n. 
These quantities were also analytically evaluated based on an empirical assumption that the dynamics loses memory 
of its non-immediate past ES- 

In this work, we develop a dynamic mean-field theory to characterize the attractors of the asymmetric neural 
network by extending the state concentration concept 17j, recently introduced to characterize the robustness and 
quickness of network’s transient dynamics. Our analysis estimates the (cumulative) distribution for the cycle length 
of attractors, the total number of attractors, and the volume of attractors in the state space. 

We remark that our work has three-fold contributions for understanding the statistical properties of the dynamics 
of randomly connected neural networks. First, a theoretical support for the Markovian property of state concentration 
dynamics (termed the annealed approximation in Ref. jlbj j ) is provided by computing the finite-size effect of the mean- 
field theory by explicitly evaluating the quenched randomness of network connections. Second, we provide a detailed 
picture about how state concentration happens in randomly connected neural networks. In particular, we quantify 
what is the characteristic distance that typically leads to state concentration and evaluate characteristic time scales 
underlying the state concentration dynamics. Finally, our theory gives a good consistency with numerical simulations 
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on the distribution of the cycle length, the typical cycle length, the number of cycles, and the total number of states 
belonging to all attractive cycles. These three contributions complement the previous studies in bee mm and 
provide deep insights towards the dynamics of randomly-connected neural networks. 

The paper is organized as follows. In Sec. UD we define the neural network model and its dynamics. Mean-field 
analysis is presented in detail in Sec. IIIII Results on the state concentration and statistical properties of attractors 
are discussed in Sec. m and Sec. 0 respectively. We summarize our results in Sec. EH 


II. MODEL DEFINITION 

We consider randomly connected neural networks consisting of n neurons (units). Each unit interacts with all the 
other units with an asymmetric coupling. We use Jij to represent the coupling strength from unit j to i, and Jij is 
independent of Jji (and others), and they follow the same Gaussian distribution with zero mean and variance 1 jn. 
The state of neuron i (i = 1,..., n) at time f + l(t = 0,l,...) is set according to the parallel deterministic dynamics 
in discrete time steps by its input hi(t) as 


<Ji{t + 1) = sgn {hi(t)) 


+1, (active state) 
— 1, (silent state) 


(1) 


where the input is defined by 


hi(t) = ^2 ( 2 ) 

3 =1 

Therefore, by combining Eqs. © and ©, the dynamics are summarized by <Ji(t + 1) = sgn^JT JijOjit in terms of 
the activity, or equivalently by hi(t + 1) = JT JySgn^ (£)) in terms of the input. 

We later compare the dynamics of randomly connected neural networks to that of random Boolean networks f2l)l.[2~lT , 
where each one of 2 n states er(i) = {ai(t)\i = 1,..., n} is randomly mapped to another. 

III. MEAN-FIELD ANALYSIS 

We study the dynamical evolution of the overlap between two states along a trajectory, expecting that its distribution 
across different realizations of {Jij} contains information about the structure of attractors. Let us define the overlap 
of two states, er (t) = {ai(t)\i = 1,..., n} and er(s) = {eq(s)|* = 1,..., n} along the same trajectory at different times 
t> s, by 



n 

J2<Ti{t)Gi{s). 

1=1 


( 3 ) 


This overlap takes +1 if two states are the same and —1 if one is the sign-flip of the other. The overlap takes a 
discrete value for a finite size network, but can be approximated as a continuous quantity in the large network size 
limit. The mean-field theory provides the dynamics of this overlap parameter and its fluctuation defined over the 
ensemble of random {<7^} (see Appendix [Aj . The stochastic dynamics of the overlap is well approximated for large n 
by a Markovian process 


Pt+l,s+l{o) ~ J W(q\q')P ts {q')dq', (4) 

where Pts(q) = Prob (qt s = q) is the probability of qt s = q■ The transition probability is approximated for large but 
finite n by a simple binomial distribution 


W(q\q') = 


( n ) 

\i+<pm 

n(l+«) 

2 


\n(l+q)/2j 

2 


2 


n(l-g) 


exp 


H(q) 


i + q ln i + f(q') 


i - q ln i - <pW) 


( 5 ) 
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where ip(q) = (2 /tt) arcsing and H (g) = — ^±2 i n 1±2 _ kzl i n iz 2 . Note that Eq. ([5]) summarizes the probability that 
n(l + q)/2 out of n neurons take the same sign in state cr(t + 1 ) and er(s + 1 ), given that n{ 1 + q')/2 out of n neurons 
take the same sign in the previous step. The binomial distribution in Eq. (0 suggests that the state overlap for each 
neuron is approximately independent, occurring with probability (1 + ip(q'))/2 (see Appendix [Al for a support). 

A similar expression is obtained for random Boolean networks by replacing ip{q) with v 3 BA r ( < ?) = , simply 

reflecting completely random nature of state transitions. 

It is worth noting that, the dynamics of the overlap becomes deterministic in the limit of large n according to the 
central limit theorem, which is the so called distance law [2214261] . g t+ i jS+ i = ip(q ts )- In this equation, the equality 
holds only at q = 0 and q = ±1, and otherwise |^>(g)| < |g|. Hence, in the limit of large n, the overlap monotonically 
converges to the stable solution of q = 0, implying that two distinct states would never converge. On the other hand, 
for finite n, the overlap fluctuates with amplitude ~ 1 / y/n about the deterministic solution (see detailed explanations 
in Appendix [A}. Thus, the overlap can evolve from q < 1 to q = 1 in time, indicating that system’s state eventually 
comes back to one of previously visited states in a finite network. 

The Markovian process of Eq. © sequentially provides P t+ i >t (q) for t = 1, 2,... for some positive time difference 
l = t — s given an initial distribution at t = 0 . The initial distribution is denoted by Pi,o(q) = Prob^o = <?)■ Since 
the initial state, cr(0), is selected randomly and independently from { ,Jjj }, we can set without losing generality the 
initial state to be eq(0) = 1 for all i (see Appendix iBl). In this case, the initial overlap of interest is expressed by 

1 n 

qi, o = - V'o-j(Z)o- i (0) 

i= 1 

1 n 

= -J2 s £ n (hi{l- 1 )). ( 6 ) 

i=l 

If l is small, Pi.o(q) reflects the memory of the initial state er(0) and is hard to evaluate exactly. However, if l is large, 
the mean-field result in Appendix [A] indicates that {hi{l — 1 )|* = 1,2,... ,n} follows approximately a zero-centered 
independent Gaussian distribution with unit variance in the large network-size limit. This means that the state 
overlap of Eq. © approaches a distribution centered around zero with variance ~ 1/n. In particular, P^o(q) tends 
for large l to a binomial distribution ( n / 1+ ” o ^ 2 ) 2 —rl , where the probability of qifi = ±1 is approximately 2~ n in the 
large network-size limit. We confirm this property later with numerical simulations. 


IV. STATE CONCENTRATION 

In this section, we consider how different states concentrate in time. The Markovian dynamics of Eq. © are 
completely characterized by the eigenvalues and eigenvectors of the transition probability W [27j. Let f a (q) and 
A a (< 1) respectively be the ath eigenvector and eigenvalue of W. We rank eigenvalues in a descending order, i.e., 
Ai > A 2 > • • • > A„+i (the number of possible values for q is n + 1). The distribution of the overlap is expressed by a 
weighted sum of the eigenvectors as 


n+1 

Pt+lAQ) = Y,( K) tA afa{q\ (7) 

a—1 

where {A a } is a set of initial coefficients that satisfies Pi,o{q) = S a ^a/a(g)- Hence, as the time step increases, 
Pt+i,t(q) becomes progressively dominated by the components with large eigenvalues. 

It is easy to see that W has two trivial eigenvectors fi(q) = 5 qt 1 and /2(g) = hj.-i with degenerate eigenvalues 
Ai = A 2 = 1. Note that 8 <hqo is the Kronecker delta function. The third eigenvector / 3 (g) is a non-trivial one and its 
eigenvalue A 3 ~ 1 — exp(—0.41n) exponentially approaches 1 with n (see Fig.|T|for the numerical result). The fourth 
eigenvalue converges to A 4 ~ 0.67 in the limit of large n. The half-decay time of the a-th component is described 
by these eigenvalues and given by t a = (ln2)/(— lnA a ), or equivalently, (A a ) ta = 1/2. There is a clear gap between 
the decay time of the third and fourth eigen-components. This result indicates that, for large n, the distribution 
of the overlap must approach quickly a quasi-stationary state P*(g) = J^_i A a f a {q) at around G ~ 1.73 and stay 
unchanged until k, 0.69exp(0.41n). In particular, the quasi-stationary state is characterized solely by / 3 (g) except 
at g = ± 1 . 

This analysis also suggests when the mean-field theory breaks down - the theory is not applicable once the third 
eigen-component significantly decays at around 1 3 . That is, after an exponential time of exp(0.41n), the distribution 
of the overlap becomes the linear combination of /i(g) and /2(g), be., every state becomes either the same or the 
sign-flip of the others. However, this never happens in a real system. 
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FIG. 1: (Color online) The non-trivial eigenvalues A 3 and A 4 of the transition matrix W. 


In the remaining part of this section, we characterize in more detail the quasi-stationary state in large n limit, from 
which we extract the structure of attractors. 

We first introduce an auxiliary notation 

a-t+iM) = ( 8 ) 

n 


where f exp (nat+i,t{q))dq = 1 according to the normalization constraint. With this notation, we can express the 
dynamics of Eq. 0 by 


cH+i+i,t+i{.q) 


-In f W (qlq^Pt+i^q^dq' 
n J 


H ( q ) + max 
q' 


1 + g l + (p(g') 1 - q 1 - y(g') 

2 n 2 2 n 2 


+ &t+i,t(q') , 


( 9 ) 


where the Laplace’s method was applied in the second line assuming large n. Note that, in the above expression, the 
maximizer q’ of the second term is a function of q. In particular, the well-defined asymptotic solution of Eq. be., 


a* (g) = H(q) + max 
1' 


1 + g 1 + ^g 7 ) 1 -g l-y(g') 

2 n 2 2 n 2 


+ <x*(q') , 


( 10 ) 


with finite a*(g) self-consistently provides the quasi-stationary state. Note that Eq. (1101) permits arbitrary discon¬ 
tinuity of Q<*(g) at q = ±1, reflecting that q = ±1 is the sink of the Markovian process. However, in the following 
analysis, we assume continuous a*(g). 

Next, we define index /3t+i,t(qts) = ^ lnProb^t+z^Igt+z+yt+i = 1) that characterizes the probability that two states 
cr(t + l ) and cr(t) have overlap qt+i,t before converging in the next step (q t +i+i t t+i = 1). This index is expressed, 
using the Bayes theorem, in terms of a by 

1 ^ W(l\q’)P t+l , t (q') 
n ff+z+i,t+i(l) 

l n 1 + ^ q ^ + a t+ i tt (q') - a t +z+i, t +i(l). (11) 

This means that, for large n, most of the trajectories that lead to state concentration had an overlap specified by the 
peak location of Pt+i,t, be., argmax g / j3t+i,t{q'), in the previous step. 

In the case of randomly connected neural networks studied here, /3t+i,t(q) has two peaks. As shown in Fig. [2](b), 
one peak is located at q = 1 reflecting the monotonic increase in ln 1+ ^ toward q = 1 and the other peak is located 
at q < 1 reflecting the peak of a t +i,t(q ) at q = 0 in Eq. (fill) . The q < 1 peak shifts to a larger positive value of q and 
its amplitude loses the dominance over the q = 1 peak as t increases because at+i,t(q) becomes blunt at large t (Fig. [2] 
(a)). The two peaks become comparable at around t^. In finite-size systems, the two peaks become indistinguishable 
once the difference of the peak values becomes less than 1/n. The result indicates that states concentrate mainly from 
q « 0.5 at the beginning but concentrate equally from q ft 0.75 and q = 1 at the quasi-stationary state. 


Pt+i,t(q') = 
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(a) (b) 

a 




FIG. 2: (Color online) The Markovian dynamics of at+i,t and /3t+i,t in time, (a) The index at+i,t characterizes the dynamics of 
distribution P(q t +i : t)- The line color changes from the lowest curve (the orange curve at t = 0, i.e., ai : o or /3i : o) to the yellow, 
and finally to the gray (the top curve at t = 10, i.e., Qi+io.io or /3i+io,io). (b) The index /3t+i,i characterizes the dynamics of 
distribution P(qt+i,t\qt+i+i,t+i = 1). The result indicates that states concentrate mainly from q ft 0.5 at the beginning but 
concentrate equally from q ft 0.75 and q = 1 at the quasi-stationary state. We used ai : o(q) = H(q) — In2 as the initial condition 
assuming no correlations at starting points. The results hold for any l > 1. 


These dynamics of the state overlap reflects the specific structure of attractors as we shall show below. In contrast to 
the above situation, for trivial dynamical systems that converge to a unique fixed-point ( e.g ., hi(t + 1 ) = (l + hj(t))/2), 
Pt+i,t(q) has a unique peak, which tends to approach q = 1 at large t , indicating that most states concentrate from 
nearby locations. On the other hand, in random Boolean networks, states concentrate randomly from any overlap 
values. Because most states are orthogonal to each other for large n , states mainly concentrate from <7 ft: 0 (see 
Appendix [C| . 


V. STATISTICAL PROPERTIES OF ATTRACTORS 


In this section, we analytically describe the statistical properties of attractors for randomly connected neural 
networks using the state concentration probability fl7l ]. The state concentration probability pt+i, s +i that characterizes 
the conditional probability of cr(t+ 1 ) = er(s + 1 ) given that no states up to time t along the trajectory are the same or 
the sign-flip of the others. Because of the symmetry, p t + i jS +i also characterizes the probability of cr(t + 1) = —<r(s + 1 ) 
given the same condition. Hence, 

Pt+ m+i = Prob(g t+M+ i = ±l\{q t ', s > ± ±l|t' < t,s' < t 1 }). (12) 

This state concentration probability is further approximated under the Markovian approximation of Eq. (© by 


Pt+i,s+i ~ / W(q t+ 1 ,,,-t-i — 1| qts)P(qts)dqt s 

= exp(?ra t+ i jS+ i(l)), (13) 

which directly follows from Eq. ©. Note that, based on the consideration of the previous section, we used in the 
second line that the result is not sensitive to the exclusion of q 1 = ±1 from the integral for large n. This is because 
the maxq/ in Eq. m is insensitive to its argument at q' = ±1 unless the initial distribution P^o(q) is sharply peaked 
at q = ±1, which is not the case here (c./. Eq. ©). 

Hence, based on the Markovian property, the probability that the dynamics starting from <r(0) comes back for the 
first time to er( 0 ) after l steps without visiting any sign-flip of previously visited states is described for large n by 


P(l) = Prob ({< 71,0 7 ^ ±1}: {? 2 ,« ^ ±l|s = 0,1}, • • • , {qi-i, s 7 ^ ±l|s = 0,1,2}, qi t0 = 1) 


1 — 2 


s=0 


= (1 - 2pi )0 ) JJ(1 - 2p 2)S ) JJ(1 - 2pi- ljS )pi, { 

s =0 

(1-1 t-l 

= Pi ,0 exp EE ln(l - 2 p tyS 


(14) 


\t —1 s—0 
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Note that, in the second line of Eq. m, the factor J|*_ 0 (l — 2pt iS ) describes the probability that the state makes a 
transition at time t to a state distinct from {±er(s)|s = 0,1,..., t — 1}. The final factor, p/ 0 j describes the probability 
of coming back to the initial state <r(0) after l steps. 

(a) (b) 




FIG. 3: There are two kinds of limit cycles if the cycle length l is even, (a) In the first kind of cycles, the cycle closes without 
ever visiting the sign-flip of previously visited states, (b) In the second kind of cycles, the state first makes a transition to the 
sign-flip of the initial state after 1/2 steps, i.e., a(l/2) = —<r(0). If this happens, the cycle must close after l steps. 


Altogether, the probability that a certain state, er(0), belongs to a cycle of length l (revisiting <x(0) for the first 
time after l steps) is described for large n by [l6| 


P( 0 


P{1), (odd l) 
P(l) + P(l/2). (even l) 


(15) 


Notably, the probability takes different expressions for odd and even l. If Z is odd, Eq. m directly gives the 
probability. If l is even, there are two separate kinds of contributions depicted in Fig. [3] The first contribution is 
from cycles that close without ever visiting the sign-flip of their history. The second contribution is from cycles that 
involve a transition at step 1 /2 to the sign-flip of their initial state, which then guarantees that the cycle closes in l 
steps. 

The final step is to evaluate the state concentration probability pt, s - The initial state concentration probabilities 
are simply given by 

Pl,0 « 2~ n = Pinit (16) 

for large n and l as discussed in Sec. IIIII Although this approximation is inaccurate for l < 10, it becomes accurate 
for large n over a wide range of l that includes the typical cycle length (Fig. [p. 



FIG. 4: (Color online) Simulation results of the initial state concentration probability pi } o as a function of l when n varies. The 
results are obtained based on statistics collected from 2 x 10 9 networks. The initial state is always set to <Ji(0) = 1 for all i (see 
Appendix ED). The sampling error increases with n because of exponential increase of the state space. Note that pgo = 2 71 is 
an exact result. 

On the other hand, the state concentration probability pt+i,t at t > 1 is computed sequentially by Eqs. m and m 
In particular, this probability quickly converges within several steps ( t c ~ 5; see, Fig. [2] (a)) to the quasi-stationary 
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value of 


Poo = lim pt+i, t = exp (na(l)) (17) 

t—t OO 

for any l > 1, where a(l) = —0.46 from Eq. (HOD . That is, the state concentration probability quickly converges in 
several steps from the initial value of pi n it « exp(—0.69n) to the asymptotic value p^ « exp(—0.46n). 

Therefore, P(l) of Eq. (fT4l) can be further approximated using p; n it and poo by 


m 


Pinit exp 


Pinit exp 


Pinit exp ( - — 


l-l t -1 

_ 2v °°>+ ° 

t—1 s =0 

l 2 

— ln(l - 2poo) 



Poo Pinit \ 
1 - 2p oc ) 


(18) 


where r = \f—2/ ln(l — 2poo) is the characteristic cycle length that grows exponentially with the system size, con¬ 
sistent with the numerical observations [l9j. Note that, in the first line of Eq. we used the relationship that 

|Poo — Pts I < |Poo — Pinit | (for any t and s; see, Fig. [2]) to upper-bound the deviation of pt s from poo- To make the 
contribution of the O (2 t c l p (l 2 p init ) negligible, the approximation in the second line assumes 


4f r 


Poo Pinit 


-(1 - 2poo) ln(l - 2poo) 


< l and l <C 


l-2p 0 


2tc(Poo Pinit) 


(19) 


The first condition in Eq. m requires that —^ ln(l — 2poo) 2, while the second condition ensures that 
2 t c l 1. The range of l specified by Eq. (fliill is roughly 10 Z exp(0.46n)/(2f c ) at n > 10. Hence, the 
characteristic cycle length r « exp(0.23?r) is well within this range. Incidentally, r is known to also characterize the 
typical transient time scale to enter a limit cycle fl6j |. 
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cycle length 


FIG. 5: (Color online) (a) Probability distribution of cycle lengths, (b) Cumulative distribution of cycle lengths. The numerical 
data is obtained from 1000 samples for n = 10, 500 samples for n = 15, and 200 samples for n = 17. The inset shows an 
enlarged view at small cycle length. 

The probability of observing a cycle of length l is given by P(l)/(IZ) with a normalization constant Z = i -P(0 A- 
In this expression, the probability, P(l), of a state belonging to a cycle of length l should be divided by l to provide 
the cycle length probability since all states within a cycle share the same cycle length. Note that the normalization 
constant Z represents the probability of a state belonging to a cycle (attractor). Figure [5] (a) shows the comparison of 
numerically obtained cycle length probability with its theoretical estimate. Numerical details to collect the statistics 
of the attractors are given in the Appendix [D] The theory nicely captures this probability at around the characteristic 
cycle length, including the difference in probability for odd and even cycle lengths, as n becomes large. However, 































(b) 


( a ) 




FIG. 6: (Color online) The first (mean) and second moment of the cycle length distribution. Theoretical predictions and 
numerical simulations are compared. The results are averaged over many random realizations of the networks (from 1000 
samples for n = 10 to 100 samples for n = 18). 


the deviation is large for non-typical l in finite networks. The cumulative distribution of cycle length is similarly 
obtained by F(l) = (1 /Z) Y^i'=i -P(0/^ ~ ( Ii F(l')/l'dl' + J^ 2 P(l') / IZ . The comparison of F(l ) with the 

numerical results is shown in Fig. [5] (b). The discrepancy tends to become small for larger n (see the inset of Fig. [5] 

(b))- 

The first moment (mean value) and the second moment of the distribution can be computed analytically as well. 
Their values are given by: 


(0 

(O 


[1 — erf(l /t)\ 
3 fy T ^dt - 

2 r 2 e - 1 /" 2 


L 


OO 

l/r : 


dt 


( 20 ) 

( 21 ) 


where erf(x) = f* e t2 dt and f™ r 2 ~ — je — a(l)n in the large n limit, where 7 e = 0.5772 is the Euler 

constant. The theoretical predictions are compared with the numerical results in Fig. 0 The exponential growth 
of the typical cycle length is verified, which suggests that chaotic attractors exist in the state space of a randomly 
connected neural network. 

Following the same spirit, one can derive the number of attractors as 2 n Z ~ —3a(l)n/4 — 37 e/ 4, from which a 
linear dependence [l^llIlQil is confirmed (see also Fig. [7] (a), the linear fit gives the slope 0.360 ± 0.010, compatible 
with the theoretical value 0.342). Another interesting quantity is the number of the attractive states _/V at t belonging 
to all cycles (e.g., a cycle of length l has l attractive states), which is expected to grow exponentially with the network 
size n. This quantity is evaluated by our theory as N at t = 2 n X)?=i -P(0 j anc i can be quantified as the growth rate 
(entropy density) s = linin^oo ■Mn./V att . In the large n limit, we obtain s = —a(l)/2, which is compared with the 
numerical results at finite n. As shown in Fig. [7] (b), as n increases, s decreases, approaching the asymptotic limit 
0.2277. 

The deviation at small n (or at l far from the characteristic length, see Fig. 0 comes from three approximations. 
One is Eq. (ED, which becomes invalid at small l where p; n it also depends on l, but Eq. (USD becomes reasonable for 
large l (as occurs in our case where the typical cycles are long). The second one is Eq. (1181) . This approximation is 
valid in the range of l specified by Eq. (USD. Note that this condition is consistent with the numerical results shown 
in Fig. 0 The last approximation is Eq. (0, which breaks down for small n at which two or more time-steps memory 
should be considered. In the large network size limit, these approximations become exact and the dynamics can be 
described by a Markovian process in terms of the state overlap. Thus, as we focus on the structure of attractors at a 
relatively large but finite n, these effects are not significant. 
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FIG. 7: (Color online) (a) Linear dependence of the number of cycles on network size n. (b) Entropy density of the attractive 
states defined by s = — ln7V at t. As n increases, the numerical data gets to the theoretical prediction. 


VI. CONCLUSION 

In this work, we studied the deterministic dynamics of a randomly connected neural network and proposed a simple 
Markovian stochastic process to describe the evolution of the overlap of two states along the dynamics trajectories. 
The properties of the state concentration can be studied by a mean field computation, and furthermore, the theoretical 
cumulative distribution of cycle length is compared with the numerical simulation results. The typical length of cycles 
is predicted and observed to grow exponentially with the network size. The number of attractive states on all cycles 
has also an exponential growth with the network size, and its typical value can also be predicted by our theory. 

Our theory should have potential to be generalized to treat more complex situations, e.g., couplings between neurons 
are correlated, where one time-step memory is not enough to describe the dynamics and strong memory effects induced 
by retarded self-interaction could be incorporated by introducing a back-action field (two time-steps memory) [18}. The 
current analysis is also restricted to the parallel type of dynamics, whereas, the sequential (asynchronous) dynamics 
seems to be more natural, and our current method may apply to this type of dynamics, although the computation 
will become more complicated. However, the statistical properties of attractors would not change qualitatively, as 
expected from numerical simulations flU . 

Our work is expected to provide insights towards understanding how the neural network processes information and 
stores temporal sequences [29j, which will be left for future study. 
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Appendix A: Dynamic functional integral method 

We compute the dynamics of the state overlap using the dynamic functional integral (mean-field) method (see Q for 
similar calculations). In this section, we express time indices as lower case characters, e.g., h t (t) = ha, and follow the 
convention that summations are neglected if the same indices appear twice in an expression, e.g., JN 

Let us first define the ensemble of state trajectories, h = {hu\i = 1 ,... ,n, t = 0,1,...}, averaged over different 
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networks: 


P(h) = 


8 {hit dij ) 




(Al) 


J j 


where (5(-) is the Dirac delta function and [-]j is the average over the random couplings. In the following, we denote 
by (•) an average with respect to P(h). 

The joint distribution of the overlap q = {(?t s }t> s is 


\t>S 


p(q) = (II' 5 

s 

Ii dh 


Qts 

n 


il <5 (hit dijOjt) 


n 


dhjtdlij 

2n 


n* 

- t>s 


Qts &jt&js 
n 


Gxp ^ihghg j ^ 


t>s 


Qts t t 

n 


(A2) 


n 


dhitdhi, 

2n 


exp (i hithtt - —hithi S ajtcrj S ] 


£>s 


Qts &it&js 
n 


n 

i,t 

n 

\.t>s 

n 

\.t>s 


dhgdh it 
2n 


| i dghg yQtsh'ith'isj ^ 


t>s 


Qts Ojt^- 


ndq ts 
2i r 

ndq ts 

2n 


n 


dhudhi 

27T 


1 


exp n\q ts qts - i qtsVjtVjs + ih it h it - -qtsKtK. 


exp (~nf(q,q)), 


where we have defined the action 


/(<7, q) = -i qtsqts - In J dH exp £, 


C = ihfht - -qtsh t h s - ig ts cr t (T s , 


(A3) 


and dH = Y\ t {dhtdht) / {2n). In the derivation, we have used the Fourier transformation of the delta function 
8{x) = J dx/(2w) exp(ixx) for each delta function in Eq. (IA2I) . and we have taken the average over the independent 
Gaussian variables {,/ 1? } of mean 0 and variance 1/n. For large n, the distribution of Eq. (IA2I) is well approximated 
by a Gaussian distribution, where the peak is specified by the saddle-point equations: 


n d f - j. i/i, i, \ 

° = d^ = ~ lqts+ 2 {hths)c ' 


(A4) 


0 = 


df 
di q ts 


= -qts + (crt<J s )c, 


with average {■)c = (/ • e c dH)/(f e c dH). We can easily see that q = 0 is a solution of Eq. (IA4I) |5j. Hence, if q = 0, 
the average {-)c 0 — (‘)>C|q=o is an average over Gaussian h of mean ( ht)c 0 = 0 and covariance (hth s )c 0 = qts, which 
simplifies the saddle-point equation of Eq. (IA4I) in terms of a closed-form expression of q by 

Qt+i,s+i = (<xt+i<x s +i)c 0 (A5) 

= JJ DxDy sgn(x)sgn (q ts x + yjl- q? s y 

= <p(qts), 
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with ip(q) = (2/ir) arcsin(< 7 ) and a Gaussian measure Dx = exp(— x 2 /2)/\f2jr. Note that |</?(g)| < \q\ and the equality 
holds only at q = 0 and q = ±1. Hence, unless q = 1 initially, the overlap rapidly converges in a few steps to zero in 
the n —> oo limit. 

The order parameters fluctuate around the saddle-point solution of Eq. (IA5D for finite n. This fluctuation of q and 
q is characterized to the leading order by the Hessian matrix of f, i.e., 


Ats,t's' 
Bts.t's' 
Cts,t' s' 


d 2 f = 1 d(h t -h s -)c 0 
dqtsdqt's' 2 dqts 


d 2 f 

dqtsdqt's’ 

d 2 f 

dqtsdqt's' 


— —iT i 1 

= 0’s'} Co 


d{<J t 'Cr s ’)c 0 

dq ts 


0 J 


(A6) 


for t > s and t' > s' , where the Hessian matrix is evaluated at the saddle-point solution of the order parameters, i.e., 
q = 0 and the solution of Eq. (1A5I) . 

In the current setup, the Hessian matrix is simply given by 


At s ,t's' — 0, (A7) 

Bts,t's' — 1 $t,t'&s,s' T i^ (qts'j^t',t+i3s',s+h 

Ct s ,t's ' = + 0(q~), 

for t > s and t' > s', where tp'(q) = dtp(q)/dq. Note that the 0{q 2 ) contribution in Ct s ,t's' can be more explicitly 
estimated, for example by applying Plackett’s approximation [28j |. Here, we would like to evaluate the (n multiplied) 
covariance of the overlap parameter, A ts ,t's' = nCov[qt s , qt's'}- By applying the matrix inversion lemma, we find that 
its inverse is 


A- 1 = A - BC~ 1 B t = (iB)(iB) T + 0(q 2 ). 


(A8) 


This relation indicates that for small q the linear combination, 

rj = y/n(iB) T Sq, (A9) 


of the fluctuation of the overlap parameter, dq, is white Gaussian random variables. To see this, one can apply the 
transformation of variables and find that 


P(rj) 


J S (r) — y/n(\B) T 8q) exp — dq 1 A 1 dq s jddq 



Thus, Eq. (IA9I) indicates that the finite-size fluctuations of the order parameter are described by 


(A10) 



i^t'^t^s'^s {.qt's' )^,t / +1^5,S / + l] dqt's' 

5q ts ~ ip'{q t -i,s-i)dqt-i,s-i- 


(All) 


Altogether, summarizing that qts = p{qt- i,s-i) in the n —> oo limit and that the finite-size correction is described by 
Eq. (1A11I) . we obtained, for finite n, 


q ts = <p{qt- i, s -i) + —j=qts + 0 {q 2 ), 
\/n 


(A12) 


which is a simple Markovian process that involves white Gaussian noise of variance 1 /n. 

Recalling the definition of the overlap parameter, q t+ i, s +i = sgn(h^)sgn(/ij S )/n, and that hi with different i tend 
to become independent in the n —> oo limit, we know that the overlap parameter must be distributed approximately 
according to a binomial distribution. Extrapolating this observation, the result of Eq. (IA12I) is consistent with the 
Markovian dynamics of 


P{qt+ m+i) — 


W{q t+ i }S +i\qts)P{qts)dqts 


(A13) 
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with the binomial transition probability 


W 


(^Qt+ i,s+i 


2 m — n 
n 



^ l + y(g ts ) y m ^ 1 - 


(A14) 


where m indicates the number of units taking the same state at time t + 1 and s + 1. 

In summary, this result shows that the Markovian dynamics of Eq. (1A13I) provides a good approximation of the 
dynamics of the overlap parameter once 0(q 2 ) terms become negligible near the stationary state. 


Appendix B: The dynamics of the state overlap does not depend on the initial state cr(0) 


A specific choice of the initial state <x(0) is not important to study dynamics of the state overlap for random 
ensemble of networks as long as er(0) is selected independently of the network connections {Jij}- Without losing 
generality, we can set cq(0) = 1 for all i. 

To see this point, we consider a simple transformation of variables, 

&i(t) = CTi{t)(Ti( 0). (Bl) 

The state overlap is also described in terms of these transformed variables by qt 8 = (1 /n) JA eb(f)b';(.s) and the initial 
state is given by <jj(0) = 1 for all i. 

These transformed variables follow the same update rule as the original one, 


di{t + 1) = sgn 


£• 


'XjUj 


( t ) 


(B2) 


except that the coupling matrix is given by = crj(0) <jj (0) instead of J^. Notably, the distribution of {Jij} 
is the same as that of {Jij} as long as er(0) is chosen independently of {Jij}- Therefore, to study the dynamics of 
the state overlap, we can alternatively study the dynamics of these transformed variables with the initial condition 
{d 2 (0) = l|i = 1,2, 


Appendix C: The dynamics of the state overlap in random Boolean 


networks 


The dynamics of the state overlap in random Boolean networks is described by Eq. <[9|) 
is simply 


with (fiBN(q) = Sq, i, which 


c*t+i,s+i{q) 


H{q)- ln2 + a t * s , (q ^ 1) 

max{a ts (l),-ln2 + a* s }, (q = 1) 


(Cl) 


where a} s = max ? /^! a ts {q')- Let us assume that there is no perfect overlap of states initially, j.e., Prob(® ! o = 1) = 0. 
This means that 07 , 0 ( 1 ) < — In2 + o* 0 and lim n _ >00 o* 0 = 0, because the initial overlap distribution Pi,o(q) = 
exp(no7,o(<z)) must be normalized. Thus, the dynamics of Eq. (IC1I) converges in one step to a stationary solution 

a(q) = H{q) — In 2. (C2) 


Moreover, we have from Eq. CEB 


P(q') = hi I ±l^Sl + H{q') (C3) 

/ H{q') — In 2, {q' 1) 

- \o. W = 1 ) 


This indicates that states mainly concentrate from q = 0 if they do not already concentrate. 

This analysis also provides important information about the eigenvalues of the transition matrix W at the large 
network size limit. The first eigenvalue is trivial, Ai = 1, with the eigenfunction fi(q) = <5 g ,i, indicating that states 
never separate once they concentrate. The second eigenvalue, A 2 « 1, is a non-trivial one that corresponds to the quasi¬ 
stationary state with the eigenfunction fz{q) = exp (na(q)), where a(q) is given by Eq. (IC2I) . The other eigenvalues 
A k for k > 3 are all zero because the distribution of the overlap converges in a single step to the quasi-stationary 
state. Furthermore, the fact that state concentration happens with probability 2~ n at each time step suggests that 
A 2 = 1 — 2 -n . 
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Appendix D: Simulation details of the dynamics 

The total number of states is 2 n . They form a state set called S. We also denote a path set V recording the states 

on a dynamics trajectory. Only the state index is stored in both sets. 

Step 1. Choose the first state er 1 in S as a starting point for the parallel dynamics, and remove this state from S at 
the same time. 

Step 2. er 1 evolves to er' by one step of the parallel dynamics (all neurons’ states are updated for one time). 

Step 2.1. If er' £ iS, remove it from S, put the index of er' into V, and continue to perform the parallel dynamics, 
i.e., let cr 1 = er', then go to Step 2; 

Step 2.2 Otherwise, compare cr' with the one in the set V and if they coincide with each other, a new cycle is 
identified and the length is recorded at the same time, then go to Step 3; otherwise, no new cycle is found 
and go to Step 3. 

Step 3. Go to Step 1 until the set S becomes empty. 
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